|
Energy distance is a statistical distance between probability distributions. If X and Y are independent random vectors in ''R''<sup>d</sup> with cumulative distribution functions F and G respectively, then the energy distance between the distributions F and G is defined to be the square root of

: <math>D^2(F, G) = 2\operatorname{E}\|X - Y\| - \operatorname{E}\|X - X'\| - \operatorname{E}\|Y - Y'\| \geq 0,</math>

where X, X' are independent and identically distributed (iid), Y, Y' are iid, E is expected value, and || . || denotes the length of a vector. Energy distance satisfies all axioms of a metric; in particular, it characterizes the equality of distributions: D(F,G) = 0 if and only if F = G.

Energy distance for statistical applications was introduced in 1985 by Gábor J. Székely, who proved that for real-valued random variables the squared energy distance is exactly twice Harald Cramér's distance:〔Cramér, H. (1928) On the composition of elementary errors, Skandinavisk Aktuarietidskrift, 11, 141–180.〕

: <math>D^2(F, G) = 2\int_{-\infty}^{\infty} (F(x) - G(x))^2 \, dx.</math>

For a simple proof of this equivalence, see Székely and Rizzo (2005). In higher dimensions, however, the two distances are different because the energy distance is rotation invariant while Cramér's distance is not. (Notice that Cramér's distance is not the same as the distribution-free Cramér–von Mises criterion.)

==Generalization to metric spaces==
One can generalize the notion of energy distance to probability distributions on metric spaces. Let (M, d) be a metric space with its Borel sigma algebra B(M). Let P(M) denote the collection of all probability measures on the measurable space (M, B(M)). If μ and ν are probability measures in P(M), then the energy distance D(μ, ν) of μ and ν can be defined as the square root of

: <math>D^2(\mu, \nu) = 2\operatorname{E}[d(X, Y)] - \operatorname{E}[d(X, X')] - \operatorname{E}[d(Y, Y')],</math>

where X, X' have distribution μ, Y, Y' have distribution ν, and all four are independent. This quantity is not necessarily non-negative, however. It is non-negative whenever d is a negative definite kernel, a condition expressed by saying that (M, d) has negative type. If d is a strongly negative definite kernel, then D is a metric, and conversely.〔Klebanov, L. B. (2005) N-distances and their Applications, Karolinum Press, Charles University, Prague.〕
Negative type is not sufficient for D to be a metric; the latter condition is expressed by saying that (M, d) has strong negative type. In this situation, the energy distance is zero if and only if X and Y are identically distributed. An example of a metric space of negative type but not of strong negative type is the plane with the taxicab metric. All Euclidean spaces and even separable Hilbert spaces have strong negative type.

In the literature on kernel methods for machine learning, these generalized notions of energy distance are studied under the name of maximum mean discrepancy.

Source: Wikipedia, the free encyclopedia, "Energy distance".
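As an illustration of the definitions in this article, the following is a minimal Python sketch that estimates the squared energy distance from two finite samples by replacing each expectation with an average over all sample pairs (a plug-in estimate, which is an assumption of this sketch rather than something the article prescribes). The function names `energy_distance_sq`, `cramer_distance_1d`, `euclidean` and `taxicab` are hypothetical and chosen only for this example. The metric is restricted to distances on ''R''<sup>d</sup> induced by a norm, so it covers the Euclidean case and the taxicab example but not general metric spaces.

```python
import numpy as np


def euclidean(diff):
    """Euclidean length of each difference vector (last axis)."""
    return np.linalg.norm(diff, axis=-1)


def taxicab(diff):
    """Taxicab (L1) length of each difference vector (last axis)."""
    return np.abs(diff).sum(axis=-1)


def _as_2d(sample):
    """Treat a 1-D array of scalars as n points in R^1."""
    sample = np.asarray(sample, dtype=float)
    return sample[:, None] if sample.ndim == 1 else sample


def energy_distance_sq(x, y, norm=euclidean):
    """Plug-in estimate of 2 E d(X,Y) - E d(X,X') - E d(Y,Y').

    Each expectation is replaced by the mean over all sample pairs,
    including the zero distances of a point to itself.
    """
    x, y = _as_2d(x), _as_2d(y)

    def mean_dist(a, b):
        # Broadcasting builds all pairwise differences a_i - b_j.
        return norm(a[:, None, :] - b[None, :, :]).mean()

    return 2.0 * mean_dist(x, y) - mean_dist(x, x) - mean_dist(y, y)


def cramer_distance_1d(x, y):
    """Integral of (F - G)^2 for the empirical CDFs of two 1-D samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    z = np.sort(np.concatenate([x, y]))
    # Both empirical CDFs are constant between consecutive pooled points.
    F = np.searchsorted(x, z[:-1], side="right") / x.size
    G = np.searchsorted(y, z[:-1], side="right") / y.size
    return float(np.sum(np.diff(z) * (F - G) ** 2))


rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, size=50)
b = rng.normal(loc=0.5, size=60)

# Real-valued case: squared energy distance versus twice Cramér's distance.
print(energy_distance_sq(a, b), 2.0 * cramer_distance_1d(a, b))

# Planar samples, compared under the Euclidean and the taxicab metric.
p = rng.normal(size=(40, 2))
q = rng.normal(loc=1.0, size=(40, 2))
print(energy_distance_sq(p, q), energy_distance_sq(p, q, norm=taxicab))
```

Because an empirical distribution is itself a probability distribution, the two numbers printed for the real-valued samples should agree up to floating-point error. The taxicab value can be computed in the same way, but since the plane with the taxicab metric does not have strong negative type, a value of zero there would not by itself show that the two distributions coincide.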
|